Alternate Cutoff Values and DFIT Tests of Measurement Invariance
Authors
Abstract
Likert scales are routinely used in educational and psychological research as measures of constructs of interest. If sound scale development procedures are followed, the resulting scale can reliably and validly measure a construct. However, if a given scale is used to make comparisons among different populations of respondents (e.g., cultures; Riordan & Vandenberg, 1994), over time in longitudinal measurements (Golembiewski, Billingsley, & Yeager, 1976), or across different mediums of data collection (Ployhart, Weekley, Holtz, & Kemp, 2003), measurement invariance must be established before meaningful comparisons in observed data can be made (Raju, Laffitte, & Byrne, 2002; Taris, Bok, & Meijer, 1998; Vandenberg, 2002). Recently, item response theory (IRT) methods of establishing measurement invariance for Likert data have gained acceptance. These methods typically use the nomenclature of differential item functioning (DIF). DIF is said to occur when the relationship between examinees' level of the latent trait (θ) and the probability of responses for a particular item differs between two groups (Camilli & Shepard, 1994). The Differential Functioning of Items and Tests (DFIT) framework (Raju, van der Linden, & Fleer, 1995) has been advanced for assessing both DIF and differential test functioning (DTF). Although the DFIT methodology is relatively new, it has been used in several studies published in prestigious journals (e.g., Collins, Raju, & Edwards, 2000; Donovan, Drasgow, & Probst, 2000; Ellis & Mead, 2000; Facteau & Craig, 2001; Flowers, Oshima, & Raju, 1999; Maurer, Raju, & Collins, 1999; Raju et al., 2002). Articles published in these journals (including Applied Psychological Measurement, Educational and Psychological Measurement, and Journal of Applied Psychology) lend a high level of credibility to the DFIT methodology. Despite its high-profile use in the past few years, the DFIT methodology is still under development.
Few studies have been published that examine the efficacy of the DFIT program for detecting DIF in Likert data. More importantly, DFIT relies on non-parametric cutoff values for its indices in order to evaluate DIF. The recommended cutoff values have changed substantially in recent years, raising questions about which cutoff value is optimal. In this study, we simulated data with known DIF in order to evaluate the efficacy of the DFIT program for detecting DIF using several potentially promising cutoff values.
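As a rough illustration of what a cutoff-based DIF index of this kind looks like, the sketch below computes a noncompensatory-DIF-style statistic for a single Likert item: the mean squared difference between the expected item score under focal-group parameters and under reference-group parameters, averaged over focal-group trait levels. It assumes Samejima's graded response model. The function names, item parameters, trait values, and the cutoff are all hypothetical placeholders chosen for illustration; they are not taken from the study or from the DFIT program.

```python
import math


def expected_item_score(theta, a, thresholds):
    """Expected score (0..K-1) for one item under a graded response model.

    The expected score equals the sum of the boundary probabilities
    P(X >= k | theta), one per threshold.
    """
    return sum(1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds)


def ncdif(focal_thetas, ref_params, focal_params):
    """Mean squared gap between focal- and reference-calibrated expected scores,
    averaged over the focal group's trait levels."""
    diffs = [
        expected_item_score(t, *focal_params) - expected_item_score(t, *ref_params)
        for t in focal_thetas
    ]
    return sum(d * d for d in diffs) / len(diffs)


# Hypothetical focal-group trait levels: an even grid from -2 to 2.
focal_thetas = [-2.0 + 0.1 * i for i in range(41)]

# Hypothetical five-category item: (discrimination, four thresholds).
ref_params = (1.2, [-1.5, -0.5, 0.5, 1.5])
focal_params = (1.2, [-1.0, 0.0, 1.0, 2.0])  # thresholds shifted by 0.5 to mimic DIF

index = ncdif(focal_thetas, ref_params, focal_params)

# Arbitrary placeholder cutoff, not a recommended value.
cutoff = 0.05
flagged = index > cutoff
```

Because the item is flagged by comparing the index with a fixed cutoff rather than with a sampling distribution, the choice of cutoff directly determines how many items are flagged, which is why the cutoff value is the quantity varied in the study.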
Related resources
Practical Implications of Using Different Tests of Measurement Invariance for Polytomous Measures
Using male/female and Caucasian/African American comparison groups, this study examined the practical ramifications of using two IRT-based analytic methods, DFIT and the Likelihood Ratio Test (LRT), in assessing the measurement invariance of a 21-item leadership development scale under ten sample size conditions (e.g., 200, 500, & 1000). In nine of ten conditions, the LRT identified multiple it...
Sensitivity of DFIT Tests of Measurement Invariance for Likert Data
Likert scales are routinely used in educational and psychological research as measures of constructs of interest. If sound scale development procedures are followed, the resulting scale can reliably and validly measure a construct. However, if a given scale is used to make comparisons among different populations of respondents (e.g., cultures; Riordan & Vandenberg, 1994), over time in longitudi...
Same Question, Different Answers: CFA and Two IRT Approaches to Measurement Invariance
The effectiveness of confirmatory factor analytic (CFA) and item response theory (IRT) methods of assessing measurement invariance was investigated using simulated data with a known lack of invariance. Across all study conditions, IRT likelihood ratio (LR) tests consistently outperformed both CFA and IRT differential functioning of items and tests (DFIT) analyses in terms of detecting a lack o...
A Simulation Study Comparing Two Methods of Evaluating Differential Test Functioning (DTF): DFIT and the Mantel-Haenszel/Liu-Agresti Variance
This study uses simulated data to compare two methods of calculating Differential Test Functioning (DTF): Raju’s DFIT, a parametric method that measures the squared difference between two Test Characteristic Curves (Raju, van der Linden & Fleer, 1995), and a variance estimator based on the Mantel-Haenszel/Liu-Agresti method, a non-parametric method enabled in the DIFAS (Penfield, 2005) program....
Comparing results of an exact vs. an approximate (Bayesian) measurement invariance test: a cross-country illustration with a scale to measure 19 human values
One of the most frequently used procedures for measurement invariance testing is the multigroup confirmatory factor analysis (MGCFA). Muthén and Asparouhov recently proposed a new approach to test for approximate rather than exact measurement invariance using Bayesian MGCFA. Approximate measurement invariance permits small differences between parameters otherwise constrained to be equal in the ...